
    Contrastive Learning for Lifted Networks

    In this work we address supervised learning of neural networks via lifted network formulations. Lifted networks are interesting because they allow training on massively parallel hardware and assign energy models to discriminatively trained neural networks. We demonstrate that the training methods for lifted networks proposed in the literature have significant limitations and show how to use a contrastive loss to address those limitations. We demonstrate that this contrastive training approximates back-propagation in theory and in practice and that it is superior to the training objective regularly used for lifted networks. Comment: 9 pages, BMVC 201

    Geometric Variational Models for Inverse Problems in Imaging

    This dissertation develops geometric variational models for different inverse problems in imaging that are ill-posed, designing at the same time efficient numerical algorithms to compute their solutions. Variational methods solve inverse problems by the following two steps: formulation of a variational model as a minimization problem, and design of a minimization algorithm to solve it. This dissertation is organized in the same manner. It first formulates minimization problems associated with geometric models for different inverse problems in imaging, and it then designs efficient minimization algorithms to compute their solutions. The minimization problem summarizes both the data available from the measurements and the prior knowledge about the solution in its objective functional; this naturally leads to the combination of a measurement or data term and a prior term. Geometry can play a role in any of these terms, depending on the properties of the data acquisition system or the object being imaged. In this context, each chapter of this dissertation formulates a variational model that includes geometry in a different manner in the objective functional, depending on the inverse problem at hand. In the context of compressed sensing, the first chapter exploits the geometric properties of images to include an alignment term in the sparsity prior of compressed sensing; this additional prior term aligns the normal vectors of the level curves of the image with the reconstructed signal, and it improves the quality of reconstruction. A two-step recovery method is designed for that purpose: first, it estimates the normal vectors to the level curves of the image; second, it reconstructs an image matching the compressed sensing measurements, the geometric alignment of normals, and the sparsity constraint of compressed sensing. The proposed method is extended to non-local operators in graphs for the recovery of textures. 
The harmonic active contours of Chapter 2 make use of differential geometry to interpret the segmentation of an image as a minimal surface manifold. In this case, geometry is exploited in both the measurement term, by coupling the different image channels in a robust edge detector, and in the prior term, by imposing smoothness in the segmentation. The proposed technique generalizes existing active contours to higher dimensional spaces and non-flat images; in the plane, it improves the segmentation of images with inhomogeneities and weak edges. Shape-from-shading is investigated in Chapter 3 for the reconstruction of a silicon wafer from images of printed circuits taken with a scanning electron microscope. In this case, geometry plays a role in the image acquisition system, that is, in the measurement term of the objective functional. The prior term involves a smoothness constraint on the surface and a shape prior on the expected pattern in the circuit. The proposed reconstruction method also estimates a deformation field between the ideal pattern design and the reconstructed surface, substituting the model of shape variability necessary in shape priors with an elastic deformation field that quantifies deviations in the manufacturing process. Finally, the techniques used for the design of efficient numerical algorithms are explained with an example problem based on the level set method. To this purpose, Chapter 4 develops an efficient algorithm for the level set method when the level set function is constrained to remain a signed distance function. The distance function is preserved by the introduction of an explicit constraint in the minimization problem, and the minimization algorithm is made efficient by the adequate use of variable-splitting and augmented Lagrangian techniques.
These techniques introduce additional variables, constraints, and Lagrange multipliers in the original minimization problem, and they decompose it into sub-optimization problems that are simple and can be efficiently solved. As a result, the proposed algorithm is five to six times faster than the original algorithm for the level set method.
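
To illustrate the variable-splitting and augmented Lagrangian idea described in this abstract, the following is a generic sketch and not the dissertation's exact formulation. The symbols are assumptions introduced here: $E(\phi)$ is a hypothetical level set energy, $p$ is the auxiliary (split) variable standing in for $\nabla\phi$, $\lambda$ is the Lagrange multiplier, and $r > 0$ is a penalty parameter. The signed distance property is written as the constraint $|\nabla\phi| = 1$.

```latex
\min_{\phi,\,p}\; E(\phi)
\quad \text{subject to} \quad p = \nabla\phi, \;\; |p| = 1,
\qquad
\mathcal{L}_r(\phi, p, \lambda)
  = E(\phi)
  + \int_\Omega \lambda \cdot (p - \nabla\phi)\,dx
  + \frac{r}{2} \int_\Omega |p - \nabla\phi|^2\,dx .
```

In a scheme of this kind, one would typically alternate between minimizing $\mathcal{L}_r$ over $\phi$ with $p$ fixed, minimizing over $p$ on the set $|p| = 1$ with $\phi$ fixed, and updating the multiplier as $\lambda \leftarrow \lambda + r\,(p - \nabla\phi)$. Each sub-problem is simpler than the original constrained problem, which is the source of the efficiency the abstract refers to.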

    Selecting Relevant Visual Features for Speechreading

    A quantitative measure of relevance is proposed for the task of constructing visual feature sets which are at the same time relevant and compact. A feature's relevance is given by the amount of information that it contains about the problem, while compactness is achieved by preventing the replication of information between features in the set. To achieve these goals, we use mutual information both for assessing relevance and measuring the redundancy between features. Our application is speechreading, that is, speech recognition performed on the video of the speaker. This is justified by the fact that the performance of audio speech recognition can be improved by augmenting the audio features with visual ones, especially when there is noise in the audio channel. We report significant improvements compared to the most commonly used method of dimensionality reduction for speechreading, linear discriminant analysis.
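
The relevance-versus-redundancy idea in this abstract can be sketched as a greedy selection over discrete features. This is a minimal illustration under stated assumptions, not the authors' method: the scoring rule (relevance minus mean redundancy with already selected features) and the names `mutual_information` and `select_features` are hypothetical choices made here, and the features are assumed to be already quantised.

```python
import numpy as np

def mutual_information(x, y):
    """Mutual information (in nats) between two discrete 1-D arrays."""
    _, xi = np.unique(np.asarray(x), return_inverse=True)
    _, yi = np.unique(np.asarray(y), return_inverse=True)
    xi, yi = xi.ravel(), yi.ravel()
    # Empirical joint distribution from co-occurrence counts.
    joint = np.zeros((xi.max() + 1, yi.max() + 1))
    np.add.at(joint, (xi, yi), 1.0)
    joint /= joint.sum()
    px = joint.sum(axis=1, keepdims=True)
    py = joint.sum(axis=0, keepdims=True)
    nz = joint > 0  # skip empty cells to avoid log(0)
    return float(np.sum(joint[nz] * np.log(joint[nz] / (px @ py)[nz])))

def select_features(features, labels, k):
    """Greedily pick k columns: high MI with labels, low MI with picks so far.

    The score (relevance minus mean redundancy) is an assumption of this
    sketch, not a criterion taken from the abstract.
    """
    n_features = features.shape[1]
    relevance = [mutual_information(features[:, j], labels)
                 for j in range(n_features)]
    selected = []
    while len(selected) < min(k, n_features):
        best, best_score = None, -np.inf
        for j in range(n_features):
            if j in selected:
                continue
            if selected:
                redundancy = np.mean([
                    mutual_information(features[:, j], features[:, s])
                    for s in selected
                ])
            else:
                redundancy = 0.0
            score = relevance[j] - redundancy
            if score > best_score:
                best, best_score = j, score
        selected.append(best)
    return selected

# Toy data: column 1 duplicates column 0, column 2 carries new information.
a = np.array([0, 0, 1, 1, 0, 0, 1, 1])
b = np.array([0, 1, 0, 1, 0, 1, 0, 1])
X = np.stack([a, a, b], axis=1)
y = 2 * a + b
chosen = select_features(X, y, 2)
```

On this toy data the duplicated column is skipped in favour of the column that adds information about the labels, which is the compactness property the abstract describes.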

    On dynamic stream weighting for Audio-Visual Speech Recognition

    The integration of audio and visual information improves speech recognition performance, especially in the presence of noise. In these circumstances it is necessary to introduce audio and visual weights to control the contribution of each modality to the recognition task. We present a method to set the value of the weights associated with each stream according to their reliability for speech recognition, allowing them to change with time and adapt to different noise and working conditions. Our dynamic weights are derived from several measures of the stream reliability, some specific to speech processing and others inherent to any classification task, and take into account the special role of silence detection in the definition of audio and visual weights. In this paper we propose a new confidence measure, compare it to existing ones and point out the importance of the correct detection of silence utterances in the definition of the weighting system. Experimental results support our main contribution: the inclusion of a voice activity detector in the weighting scheme improves speech recognition over different system architectures and confidence measures, leading to an increase in performance more relevant than any difference between the proposed confidence measures.
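
As a rough illustration of per-frame stream weighting, the sketch below combines audio and visual log-likelihoods with a time-varying weight. Everything specific in it is an assumption made for this example and not taken from the paper: the entropy-based confidence, the normalisation of the two confidences into one weight, the fixed weight used when the (hypothetical) voice activity flag reports silence, and the function names.

```python
import numpy as np

def entropy_confidence(posteriors):
    """Confidence in [0, 1]: one minus the normalised entropy of the posteriors."""
    p = np.clip(np.asarray(posteriors, dtype=float), 1e-12, 1.0)
    h = -np.sum(p * np.log(p), axis=-1)
    return 1.0 - h / np.log(p.shape[-1])

def fuse_streams(audio_loglik, video_loglik, audio_post, video_post,
                 speech_active, silence_audio_weight=0.5):
    """Weighted sum of per-frame log-likelihoods with a dynamic audio weight.

    Arrays are shaped (frames, classes); speech_active is a boolean array of
    shape (frames,). The fallback weight during silence is an assumption.
    """
    ca = entropy_confidence(audio_post)
    cv = entropy_confidence(video_post)
    lam = ca / (ca + cv + 1e-12)  # audio weight; video gets 1 - lam
    lam = np.where(speech_active, lam, silence_audio_weight)
    fused = lam[:, None] * audio_loglik + (1.0 - lam)[:, None] * video_loglik
    return fused, lam

# Two frames, two classes: confident audio in frame 0, silence in frame 1.
audio_post = np.array([[0.99, 0.01], [0.5, 0.5]])
video_post = np.array([[0.5, 0.5], [0.6, 0.4]])
fused, lam = fuse_streams(np.log(audio_post), np.log(video_post),
                          audio_post, video_post,
                          speech_active=np.array([True, False]))
```

Here the audio stream dominates the first frame because its posterior is far more peaked than the visual one, while the second frame falls back to the fixed weight because no speech is detected.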

    Surface reconstruction from microscopic images in optical lithography

    We propose a shape-from-shading method to reconstruct surfaces of silicon wafers from images of printed circuits taken with a scanning electron microscope. Our method combines the physical model of the optical acquisition system with prior knowledge about the shapes of the patterns in the circuit. The reconstruction of the surface is formulated as an optimization problem with a combined criterion based on the irradiance equation and a shape prior that constrains the shape of the surface to agree with the expected shape of the pattern. To account for the variability of the manufacturing process, the model allows a non-linear elastic deformation between the expected patterns and the reconstructed surface. Our method provides two outputs: a reconstructed surface and a deformation field. The reconstructed surface is derived from the shading observed in the images and the prior knowledge about circuit patterns, which results in a shape-from-shading technique that is stable and robust to noise. The deformation field produces a mapping between the expected shape and the reconstructed surface, which provides a measure of deviation between the models and the real manufacturing process.

    VolTeMorph: Realtime, Controllable and Generalisable Animation of Volumetric Representations

    The recent increase in popularity of volumetric representations for scene reconstruction and novel view synthesis has put renewed focus on animating volumetric content at high visual quality and in real-time. While implicit deformation methods based on learned functions can produce impressive results, they are 'black boxes' to artists and content creators, they require large amounts of training data to generalise meaningfully, and they do not produce realistic extrapolations outside the training data. In this work we solve these issues by introducing a volume deformation method which is real-time, easy to edit with off-the-shelf software and can extrapolate convincingly. To demonstrate the versatility of our method, we apply it in two scenarios: physics-based object deformation and telepresence where avatars are controlled using blendshapes. We also perform thorough experiments showing that our method compares favourably to both volumetric approaches combined with implicit deformation and methods based on mesh deformation. Comment: 18 pages, 21 figures

    Contrastive learning for lifted networks

    In this work we address supervised learning via lifted network formulations. Lifted networks are interesting because they allow training on massively parallel hardware and assign energy models to discriminatively trained neural networks. We demonstrate that training methods for lifted networks proposed in the literature have significant limitations, and therefore we propose to use a contrastive loss to train lifted networks. We show that this contrastive training approximates back-propagation in theory and in practice, and that it is superior to the regular training objective for lifted networks.